AITopics | gap-dependent sample complexity

gap-dependent sample complexity

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems, Dec-23-2025, 18:13:23 GMT

We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have a finite support. We prove an upper bound on the number of sampled trajectories needed for MDP-GapE to identify a near-optimal action with high probability. This problem-dependent result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration. Our experiments reveal that MDP-GapE is also effective in practice, unlike other algorithms with sample complexity guarantees in the fixed-confidence setting, which are mostly theoretical.
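
To make the notion of a sub-optimality gap concrete, here is a minimal Python sketch. It is not MDP-GapE: it assumes the model is fully known and computes exact values by backward induction, whereas a planner only has sampling access. The toy MDP `TOY`, the helper names `q_values` and `gaps`, and the plain definition gap(s, a) = max_b Q(s, b) - Q(s, a) are illustrative assumptions; the paper's exact definition may differ, for example in how the optimal action is treated.

```python
from typing import Dict, List, Tuple

# Hypothetical toy MDP (an assumption for illustration, not from the paper).
# mdp[state][action] is a finite-support list of (probability, next_state, reward).
Transition = Tuple[float, int, float]
MDP = Dict[int, Dict[int, List[Transition]]]

TOY: MDP = {
    0: {0: [(1.0, 0, 0.25)],
        1: [(0.5, 0, 0.0), (0.5, 1, 1.0)]},
    1: {0: [(1.0, 1, 1.0)],
        1: [(1.0, 0, 0.0)]},
}


def q_values(mdp: MDP, horizon: int, gamma: float = 1.0) -> List[Dict[int, Dict[int, float]]]:
    """Exact finite-horizon Q-values by backward induction.

    Returns a list Q where Q[h][s][a] is the optimal value of taking
    action a in state s with (horizon - h) steps remaining.
    """
    value = {s: 0.0 for s in mdp}  # value after the final step
    per_step = []
    for _ in range(horizon):
        q_h = {
            s: {
                a: sum(p * (r + gamma * value[s2]) for p, s2, r in outcomes)
                for a, outcomes in actions.items()
            }
            for s, actions in mdp.items()
        }
        value = {s: max(q_h[s].values()) for s in mdp}
        per_step.append(q_h)
    per_step.reverse()  # index 0 is now the first decision step
    return per_step


def gaps(q_h: Dict[int, Dict[int, float]]) -> Dict[int, Dict[int, float]]:
    """Sub-optimality gap of each action: best Q-value in the state minus its own."""
    return {
        s: {a: max(q_sa.values()) - q for a, q in q_sa.items()}
        for s, q_sa in q_h.items()
    }


Q = q_values(TOY, horizon=2)
root_gaps = gaps(Q[0])
print(root_gaps[0])  # gaps of the two actions available in state 0 at the first step
```

Intuitively, an action with a large gap can be ruled out from few samples, while a small gap requires many; a gap-dependent bound captures this problem-specific difficulty.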

gap-dependent sample complexity, markov decision process, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.62)

Review for NeurIPS paper: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems, Jan-21-2025, 13:04:00 GMT

Additional Feedback: Post-rebuttal: The authors addressed some of my concerns. Since the authors will redesign some of the experiments in the revision, I will raise my score to 6. Comments and questions: 1. Are there any lower bound results on the sample complexity of planning? Are there any particular reasons, and what is the high-level idea of this algorithm? If I understand correctly, this rule is meant to obtain the gap-dependent sample complexity. What if we use the simple greedy policy for the first action instead? What would go wrong in the proof?

algorithm, gap-dependent sample complexity, sample complexity, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Add feedback